Coinciding Walk Kernels: Parallel Absorbing Random Walks for Learning with Graphs and Few Labels
Authors
Abstract
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Going a step further, we propose the coinciding walk kernel (cwk), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arranged labels in their local neighbourhoods are likely to have the same label – for learning problems on partially labeled graphs. Inspired by the success of random-walk-based schemes for the construction of graph kernels, cwk is defined in terms of the probability that the labels encountered during parallel random walks coincide. In addition to having an intuitive probabilistic interpretation, coinciding walk kernels outperform existing kernel- and walk-based methods on the task of node-label prediction in sparsely labeled graphs with high label-structure similarity. We also show that computing cwks is faster than computing many state-of-the-art kernels on graphs. We evaluate cwks on several real-world networks, including cocitation and coauthor graphs, as well as a graph of interlinked populated places extracted from the DBpedia knowledge base.
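The definition in the abstract, a kernel built from the probability that labels seen during parallel random walks coincide, can be illustrated with a small sketch. This is not the authors' implementation; it rests on assumptions that the abstract does not state: labeled nodes are treated as absorbing, unlabeled nodes start with a uniform label distribution, and the kernel averages the coincidence probability over a fixed number of steps. The function name `coinciding_walk_kernel` and its parameters are hypothetical.

```python
import numpy as np


def coinciding_walk_kernel(adjacency, labels, num_classes, steps):
    """Illustrative sketch only (assumed formulation, hypothetical name).

    adjacency   : (n, n) symmetric array of non-negative edge weights
    labels      : length-n integer array, -1 marks an unlabeled node
    num_classes : number of distinct labels
    steps       : number of random-walk steps to average over
    """
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    n = A.shape[0]
    labeled = labels >= 0

    # Row-normalise the adjacency matrix into a transition matrix.
    deg = A.sum(axis=1, keepdims=True)
    T = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)

    # Assumption: labeled nodes absorb the walk; isolated nodes stay put.
    stay = np.flatnonzero(labeled | (deg.ravel() == 0))
    T[stay, :] = 0.0
    T[stay, stay] = 1.0

    # Assumption: one-hot start for labeled nodes, uniform for unlabeled.
    P = np.full((n, num_classes), 1.0 / num_classes)
    P[labeled] = np.eye(num_classes)[labels[labeled]]

    # (P @ P.T)[i, j] is the probability that two independent walks,
    # started at i and j, see the same label at the current step.
    K = P @ P.T
    for _ in range(steps):
        P = T @ P
        K += P @ P.T
    return K / (steps + 1)


# Path graph 0 - 1 - 2 - 3 with only the two end nodes labeled.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
K = coinciding_walk_kernel(A, labels=[0, -1, -1, 1], num_classes=2, steps=1)
```

Because each term `P @ P.T` is a Gram matrix, the averaged result is symmetric and positive semi-definite, so under these assumptions it can be passed to a kernel method such as a support vector machine with a precomputed kernel.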
Similar resources
Coinciding Walk Kernels
Exploiting autocorrelation for node-label prediction in networked data has led to great success. However, when dealing with sparsely labeled networks, common in present-day tasks, the autocorrelation assumption is difficult to exploit. Taking a step beyond, we propose the coinciding walk kernel (cwk), a novel kernel leveraging label-structure similarity – the idea that nodes with similarly arra...
Random walks on simplicial complexes and harmonics
In this paper, we introduce a class of random walks with absorbing states on simplicial complexes. Given a simplicial complex of dimension d, a random walk with an absorbing state is defined which relates to the spectrum of the k-dimensional Laplacian for 1 ≤ k ≤ d. We study an example of random walks on simplicial complexes in the context of a semi-supervised learning problem. Specifically, we...
Halting in Random Walk Kernels
Random walk kernels measure graph similarity by counting matching walks in two graphs. In their most popular form of geometric random walk kernels, longer walks of length k are downweighted by a factor of λ (λ < 1) to ensure convergence of the corresponding geometric series. We know from the field of link prediction that this downweighting often leads to a phenomenon referred to as halting: Lon...
Learning with Graphs using Kernels from Propagated Information
Traditional machine learning approaches are designed to learn from independent vector-valued data points. The assumption that instances are independent, however, is not always true. On the contrary, there are numerous domains where data points are cross-linked, for example social networks, where persons are linked by friendship relations. These relations among data points make traditional machine l...
Semi-supervised Learning over Heterogeneous Information Networks by Ensemble of Meta-graph Guided Random Walks
Heterogeneous information network (HIN) is a general representation of many real world data. The difference between HIN and traditional homogeneous network is that the nodes and edges in HIN are with types. In many applications, we need to consider the types to make the decision more semantically meaningful. For annotation-expensive applications, a natural way is to consider semi-supervised lear...